What is claimed is:

1. A voice recognition system comprising a spectrum
converter for elongating or contracting the spectrum of 
a voice signal on the frequency axis, the spectrum 
converter including : 

an analyzer for converting an input voice signal to
an input pattern including cepstrum; 

a reference pattern memory with reference patterns 
stored therein;

an elongation/contraction estimating unit for
outputting an elongation/contraction parameter in the
frequency axis direction by using the input pattern and
the reference patterns; and

a converter for converting the input pattern by using
the elongation/contraction parameter.



2. A voice recognition system comprising:

an analyzer for converting an input voice signal to 
an input pattern including a cepstrum; 

a reference pattern memory for storing reference 
patterns;

an elongation/contraction estimating unit for 
outputting an elongation/contraction parameter in the 
frequency axis direction by using the input pattern and 
reference patterns; 

a converter for converting the input pattern by using
the elongation/contraction parameter; and 

a matching unit for computing the distances between 
the elongated or contracted input pattern fed out from the
converter and the reference patterns and outputting the
reference pattern corresponding to the shortest distance
as the result of recognition.



3. The voice recognition system according to claim 
1 or 2, wherein the converter executes the elongation or
contraction of spectrum on frequency axis with warping
function defining the form of elongation or contraction
by carrying out the elongation or contraction in cepstrum 
space. 



4. The voice recognition system according to one
of claims 1 to 3, wherein the elongation/contraction
estimating unit executes the elongation or contraction of 
spectrum on frequency axis with warping function defining 
the form of elongation or contraction by using estimation 
derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.



5. A reference pattern learning system comprising:
a learning voice memory with learning voice data
stored therein;

an analyzer for receiving a learning voice signal
from the learning voice memory and converting the learning
voice signal to an input pattern including cepstrum;

a reference pattern memory with reference patterns 
stored therein;

an elongation/contraction estimating unit for
outputting an elongation/contraction parameter in
frequency axis direction by using the input pattern and
the reference patterns;

a converter for converting the input pattern by using
the elongation/contraction pattern;

a reference pattern estimating unit for updating the 
reference patterns stored in the reference pattern memory 
for the learning voice data by using the elongated or
contracted input pattern fed out from the converter and
the reference patterns; and

a likelihood judging unit for monitoring distance
changes by computing distances by using the elongated or
contracted input pattern fed out from the converter and
the reference patterns.




6. The reference pattern learning system according
to claim 5, wherein the converter executes the elongation 
or contraction of spectrum on frequency axis with warping 
function defining the form of elongation or contraction 
by carrying out the elongation or contraction in cepstrum
space. 



7. The reference pattern learning system according
to claim 5 or 6, wherein the elongation/contraction
estimating unit executes the elongation or contraction of
spectrum on frequency axis with warping function defining
the form of elongation or contraction by using estimation
derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.



8. A voice quality converting system comprising:
an analyzer for converting an input voice signal to
an input pattern including a cepstrum;

a reference pattern memory for storing reference 
patterns;

an elongation/contraction estimating unit for 
outputting an elongation/contraction parameter in the 
frequency axis direction by using the input pattern and
reference patterns; 

a converter for converting the input pattern by using
the elongation/contraction parameter; and 

an inverse converter for outputting a signal 
waveform in time domain by inversely converting the time
serial input pattern obtained after the
elongation/contraction supplied from the converter. 

9. A recording medium for a computer constituting
a spectrum converter by executing elongation or
contraction of the spectrum of a voice signal on frequency
axis, in which is stored a program for executing the
following processes: 

(a) an analyzing process for converting an input 
voice signal to an input pattern including cepstrum, 

(b) an elongation/contraction estimating process
for outputting an elongation/contraction parameter in
frequency axis direction by using the input pattern and
reference patterns stored in a reference pattern memory;
and 

(c) a converting process for converting the input 
pattern by using the elongation/contraction parameter.



10. A recording medium for a computer constituting
a system for voice recognition by executing elongation or
contraction of the spectrum of a voice signal on frequency
axis, in which is stored a program for executing the 
following processes: 

(a) an analyzing process for converting an input
voice signal to an input pattern including cepstrum,

(b) an elongation/contraction estimating process
for outputting an elongation/contraction parameter in
frequency axis direction by using the input pattern and
reference patterns stored in a reference pattern memory; 

(c) a converting process for converting the input
pattern by using the elongation/contraction parameter;
and 

(d) a matching process for computing the distances
between the elongated or contracted input pattern and the
reference patterns and outputting the reference pattern
corresponding to the shortest distance as the result of
recognition.



11. The recording medium according to claim 10,
wherein the converting process executes the elongation or
contraction of spectrum on frequency axis with warping
function defining the form of elongation or contraction 
by carrying out the elongation or contraction in cepstrum 
space. 

12. The recording medium according to claim 10,
wherein the elongation/contraction estimating process
executes the elongation or contraction of spectrum on
frequency axis with warping function defining the form of
elongation or contraction by using estimation derived from
the best likelihood estimation of HMM (hidden Markov
model) in cepstrum space. 



13. In a computer constituting a system for
learning reference patterns from learning voice data, a 
recording medium, in which is stored a program, for 
executing the following processes: 

(a) an analyzing process for receiving learning 
voice data from learning voice memory with learning voice 
data stored therein and converting the received learning
voice data to an input pattern including cepstrum;

(b) an elongation/contraction estimating process
for outputting an elongation/contraction parameter in
frequency axis direction by using the input pattern and
the reference patterns stored in the reference pattern
memory;

(c) a converting process for converting the input
pattern by using the elongation/contraction parameter;

(d) a reference pattern estimating process for
updating the reference patterns for the learning voice
data by using the elongated or contracted pattern fed out
in the converting process and the reference patterns; and

(e) a likelihood judging process for calculating the 
distances between the elongated or contracted input 
pattern after conversion in the converting process and the 
reference patterns and monitoring changes in distance.



14. The recording medium according to claim 13,
wherein the converting process executes the elongation or 
contraction of spectrum on frequency axis with warping 
function defining the form of elongation or contraction
by carrying out the elongation or contraction in cepstrum
space.



15. The recording medium according to claim 13, 
wherein the elongation/contraction estimating process
executes the elongation or contraction of spectrum on
frequency axis with warping function defining the form of
elongation or contraction by using estimation derived from
the best likelihood estimation of HMM (hidden Markov
model) in cepstrum space. 



16. A recording medium for a computer constituting
a spectrum conversion by executing elongation or 
contraction of the spectrum of a voice signal on frequency 
axis, in which is stored a program for executing the 
following processes:

(a) an analyzing process for converting an input
voice signal to an input pattern including cepstrum, 

(b) an elongation/contraction estimating process 
for outputting an elongation/contraction parameter in
frequency axis direction by using the input pattern and
reference patterns stored in a reference pattern
memory;

(c) a converting process for converting the
input pattern by using the elongation/contraction
parameter; and

(d) an inverse converting process for outputting a
signal waveform in time domain by inversely converting the
time serial input pattern obtained after the 
elongation/contraction supplied from the converter. 



17. A spectrum converting method for elongating or
contracting the spectrum of a voice signal on the frequency 
axis, comprising: 

a first step for converting an input voice signal
to an input pattern including cepstrum; 

a second step for outputting an 
elongation/contraction parameter in the frequency axis 
direction by using the input pattern and the reference
patterns stored in a reference pattern memory; and 

a third step for converting the input pattern by
using the elongation/contraction parameter.

18. A voice recognition method comprising:

a first step for converting an input voice signal
to an input pattern including a cepstrum; 

a second step for outputting an 
elongation/contraction parameter in the frequency axis 
direction by using the input pattern and reference
patterns stored in a reference pattern memory; 

a third step for converting the input pattern by
using the elongation/contraction parameter; and 

a fourth step for computing the distances between
the elongated or contracted input pattern and the 
reference patterns and outputting the reference pattern
corresponding to the shortest distance as the result of
recognition.

19. The voice recognition method according to claim
17, wherein the elongation or contraction of spectrum
on frequency axis with warping function defining the form
of elongation or contraction is executed by carrying out
the elongation or contraction in cepstrum space.



20. The voice recognition method according to one 
of claims 17 to 19, wherein the elongation/contraction
estimating process executes the elongation or contraction
of spectrum on frequency axis with warping function 
defining the form of elongation or contraction by using 
estimation derived from the best likelihood estimation of 
HMM (hidden Markov model) in cepstrum space.

21. A reference pattern learning method
comprising: 

a first step for receiving a learning voice signal 
from the learning voice memory and converting the learning 
voice signal to an input pattern including cepstrum;

a second step for outputting an
elongation/contraction parameter in frequency axis
direction by using the input pattern and the reference 
patterns stored in a reference pattern memory; 

a third step for converting the input pattern by 
using the elongation/contraction pattern; 

a fourth step for updating the reference patterns 
for the learning voice data by using the elongated or
contracted input pattern and the reference patterns; and 

a fifth step for monitoring distance changes by
computing distances by using the elongated or contracted
input pattern and the reference patterns.

22. The reference pattern learning method 
according to claim 21, wherein the third step executes the
elongation or contraction of spectrum on frequency axis
with warping function defining the form of elongation or
contraction by carrying out the elongation or contraction
in cepstrum space.

23. The reference pattern learning method
according to claim 21, wherein the second step executes
the elongation or contraction of spectrum on frequency
axis with warping function defining the form of elongation
or contraction by using estimation derived from the best
likelihood estimation of HMM (hidden Markov model) in
cepstrum space.



24. A voice recognition method of spectrum
conversion to convert the spectrum of a voice signal by
executing elongation or contraction of the spectrum on 
frequency axis, wherein: 

the spectrum elongation or contraction of the input 
voice signal as defined by a warping function is executed
on cepstrum, the extent of elongation or contraction of 
the spectrum on the frequency axis is determined with 
elongation/contraction parameter included in warping
function; and an optimum value is determined as 
elongation/contraction parameter value for each speaker. 
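
The claims above recite warping the spectrum on the frequency axis by operating on the cepstrum, with a single elongation/contraction parameter chosen per speaker. The following is only an illustrative sketch of that idea, not the claimed implementation. It makes several assumptions that are not stated in the claims: the warping function is taken to be a first-order all-pass (bilinear) warp with parameter `alpha`; the input frames are assumed to be already aligned one-to-one with reference frames; and a simple grid search over squared Euclidean distance stands in for the HMM-based likelihood estimation. The names `warp_matrix` and `estimate_alpha` are hypothetical.

```python
import numpy as np


def warp_matrix(n_cep, alpha, n_freq=512):
    """Return W such that W @ c is the (truncated) cepstrum of the
    log-spectrum S(theta(omega)), where S is the log-spectrum described
    by cepstrum c and theta is an assumed bilinear warping function."""
    k = np.arange(n_cep)
    # Midpoint frequency grid on (0, pi).
    omega = (np.arange(n_freq) + 0.5) * np.pi / n_freq
    # Assumed warping function: phase response of a first-order all-pass filter.
    warped = omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                      1.0 - alpha * np.cos(omega))
    # Synthesis: S(w) = c[0] + 2 * sum_{k>=1} c[k] * cos(k * w), evaluated at warped w.
    weights = np.where(k == 0, 1.0, 2.0)
    synth = weights * np.cos(np.outer(warped, k))      # shape (n_freq, n_cep)
    # Analysis: c[m] ~ mean over the grid of S(w) * cos(m * w).
    analysis = np.cos(np.outer(k, omega)) / n_freq     # shape (n_cep, n_freq)
    return analysis @ synth


def estimate_alpha(frames, reference, alphas):
    """Pick the candidate alpha whose warped frames are closest to the
    reference frames (squared Euclidean distance; a stand-in for the
    likelihood-based estimation named in the claims)."""
    def distance(alpha):
        warped = frames @ warp_matrix(frames.shape[1], alpha).T
        return float(np.sum((warped - reference) ** 2))
    return min(alphas, key=distance)


# Usage with made-up data: 20 frames of 13 cepstral coefficients each.
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 13))
best_alpha = estimate_alpha(reference, reference, np.linspace(-0.2, 0.2, 41))
```

Because the warp is linear in the cepstral coefficients, it reduces to one matrix per candidate parameter value, which keeps a per-speaker search cheap. A real system would replace the frame-wise distance with the alignment and likelihood computation of its own recognizer.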